Computational clustering for viral reference proteomes

Authors

  • Chuming Chen
  • Hongzhan Huang
  • Raja Mazumder
  • Darren A. Natale
  • Peter B. McGarvey
  • Jian Zhang
  • Shawn W. Polson
  • Yuqi Wang
  • Cathy H. Wu
Abstract

MOTIVATION: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies.

RESULTS: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs based on 95, 75, 55, 35 and 15% co-membership in proteome similarity based clusters are provided. Comparison of our computational Viral RPs with UniProt's curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that was consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs.

AVAILABILITY AND IMPLEMENTATION: http://proteininformationresource.org/rps/viruses/

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
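The abstract does not spell out how co-membership is computed, so the following is only a minimal sketch of the general idea of grouping proteomes by a co-membership threshold, not the authors' method. It assumes each proteome has already been reduced to the set of protein-cluster IDs its proteins fall into, defines co-membership (hypothetically) as the shared fraction of the smaller proteome's clusters, and picks representatives greedily, largest proteome first. The names `co_membership`, `pick_representatives`, and the toy data are all illustrative.

```python
from typing import Dict, List, Set


def co_membership(a: Set[str], b: Set[str]) -> float:
    """Fraction of the smaller proteome's clusters that the other also has.

    This definition is an assumption made for illustration only.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def pick_representatives(
    proteomes: Dict[str, Set[str]], threshold: float
) -> Dict[str, List[str]]:
    """Greedily group proteomes under representatives.

    Proteomes are visited from largest to smallest (ties broken by name).
    Each one joins the first existing representative whose co-membership
    with it meets the threshold; otherwise it becomes a new representative.
    """
    order = sorted(proteomes, key=lambda p: (-len(proteomes[p]), p))
    groups: Dict[str, List[str]] = {}
    for name in order:
        for rep in groups:
            if co_membership(proteomes[name], proteomes[rep]) >= threshold:
                groups[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new group.
            groups[name] = [name]
    return groups


# Toy input: proteome name -> IDs of the protein clusters it contributes to.
toy = {
    "virusA": {"c1", "c2", "c3", "c4"},
    "virusB": {"c1", "c2", "c3"},
    "virusC": {"c1", "c9"},
    "virusD": {"c7", "c8"},
}

# The thresholds mirror the cutoffs listed in the abstract.
for cutoff in (0.95, 0.75, 0.55, 0.35, 0.15):
    print(cutoff, pick_representatives(toy, cutoff))
```

In this toy setup, lowering the cutoff merges more proteomes under fewer representatives, which is the intuition behind offering several cutoff levels.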

Similar resources

InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic

The InParanoid database (http://InParanoid.sbc.su.se) provides a user interface to orthologs inferred by the InParanoid algorithm. As there are now international efforts to curate and standardize complete proteomes, we have switched to using these resources rather than gathering and curating the proteomes ourselves. InParanoid release 8 is based on the 66 reference proteomes that the 'Quest for...

Automatic clustering of orthologs and inparalogs shared by multiple proteomes

MOTIVATION The complete sequencing of many genomes has made it possible to identify orthologous genes descending from a common ancestor. However, reconstruction of evolutionary history over long time periods faces many challenges due to gene duplications and losses. Identification of orthologous groups shared by multiple proteomes therefore becomes a clustering problem in which an optimal compr...

A Case of Methotrexate Intoxication Misdiagnosed as Crimean-Congo Hemorrhagic Fever

[No Abstract] Crimean-Congo hemorrhagic fever (CCHF) is considered the most important arboviral infection in Iran. Early diagnosis of CCHF is essential for preventing the spread of the infection and providing appropriate treatment to patients. Given that the clinical symptoms of CCHF may overlap with those of other common infectious diseases, differential diagnosis is a matter of great importance. In this...

Small molecule affinity fingerprinting. A tool for enzyme family subclassification, target identification, and inhibitor design.

Classifying proteins into functionally distinct families based only on primary sequence information remains a difficult task. We describe here a method to generate a large data set of small molecule affinity fingerprints for a group of closely related enzymes, the papain family of cysteine proteases. Binding data was generated for a library of inhibitors based on the ability of each compound to...

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...

Journal:
  • Bioinformatics

Volume 32, Issue 13

Pages: -

Publication date: 2016